Robust-adaptive dynamic programming-based time-delay control of autonomous ships under stochastic disturbances using an actor-critic learning algorithm
نویسندگان
چکیده
Abstract This paper proposes a hybrid robust-adaptive learning-based control scheme based on Approximate Dynamic Programming (ADP) for the tracking of autonomous ship maneuvering. We adopt Time-Delay Control (TDC) approach, which is known as simple, practical, model free and roughly robust strategy, combined with an Actor-Critic (ACADP) algorithm adaptive part in proposed algorithm. Based this integration, (AC-TDC) proposed. It offers high-performance approach path following ships under deterministic stochastic disturbances induced by winds, waves, ocean currents. Computer simulations have been conducted two different conditions terms all simulation results indicate acceptable performance paths comparison conventional TDC approach.
منابع مشابه
Dynamic Control with Actor-Critic Reinforcement Learning
4 Actor-Critic Marble Control 4 4.1 R-code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 4.2 The critic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 4.3 Unstable actors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 4.4 Trading off stability against...
متن کاملRobust dynamic positioning of ships with disturbances under input saturation
In the presence of unknown time-varying disturbances and input saturation, this paper develops a robust nonlinear control law for the dynamic positioning (DP) system of ships using a disturbance observer, an auxiliary dynamic system, and the dynamic surface control (DSC) technique. The disturbance observer is constructed to provide the estimates of unknown time-varying disturbances, the auxilia...
متن کاملAutonomous agent learning using an actor-critic algorithm and behavior models
We introdu e a Supervised Reinfor ement Learning (SRL) algorithm for autonomous learning problems where an agent is required to deal with high dimensional spa es. In our learning algorithm, behavior models learned from a set of examples, are used to dynami ally redu e the set of relevant a tions at ea h state of the environment en ountered by the agent. Su h subsets of a tions are used to guide...
متن کاملAn Actor-critic Algorithm for Learning Rate Learning
Stochastic gradient descent (SGD), which updates the model parameters by adding a local gradient times a learning rate at each step, is widely used in model training of machine learning algorithms such as neural networks. It is observed that the models trained by SGD are sensitive to learning rates and good learning rates are problem specific. To avoid manually searching of learning rates, whic...
متن کاملACtuAL: Actor-Critic Under Adversarial Learning
Generative Adversarial Networks (GANs) are a powerful framework for deep generative modeling. Posed as a two-player minimax problem, GANs are typically trained end-to-end on real-valued data and can be used to train a generator of high-dimensional and realistic images. However, a major limitation of GANs is that training relies on passing gradients from the discriminator through the generator v...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of marine science and technology
سال: 2021
ISSN: ['2709-6998', '1023-2796']
DOI: https://doi.org/10.1007/s00773-021-00813-1